28 research outputs found
Emergent user behavior on Twitter modelled by a stochastic differential equation
Data from the social-media site, Twitter, is used to study the fluctuations
in tweet rates of brand names. The tweet rates are the result of strongly
correlated user behavior, which leads to bursty collective dynamics with a
characteristic 1/f noise. Here we use the aggregated "user interest" in a brand
name to model collective human dynamics by a stochastic differential equation
with multiplicative noise. The model is supported by a detailed analysis of the
tweet rate fluctuations and it reproduces both the exact bursty dynamics found
in the data and the 1/f noise.
Correlations Between Human Mobility and Social Interaction Reveal General Activity Patterns
A day in the life of a person involves a broad range of activities which are
common across many people. Going beyond diurnal cycles, a central question is:
to what extent do individuals act according to patterns shared across an entire
population? Here we investigate the interplay between different activity types,
namely communication, motion, and physical proximity by analyzing data
collected from smartphones distributed among 638 individuals. We explore two
central questions: Which underlying principles govern the formation of the
activity patterns? Are the patterns specific to each individual or shared
across the entire population? We find that statistics of the entire population
allow us to successfully predict 71% of the activity and 85% of the
inactivity involved in communication, mobility, and physical proximity.
Surprisingly, individual level statistics only result in marginally better
predictions, indicating that a majority of activity patterns are shared across
our sample population. Finally, we predict short-term activity patterns using
a generalized linear model, which suggests that a simple linear description
might be sufficient to explain a wide range of actions, whether they be of
social or of physical character.
Measure of Node Similarity in Multilayer Networks
The weight of links in a network is often related to the similarity of the
nodes. Here, we introduce a simple tunable measure for analysing the similarity
of nodes across different link weights. In particular, we use the measure to
analyze homophily in a group of 659 freshman students at a large university.
Our analysis is based on data obtained using smartphones equipped with custom
data collection software, complemented by questionnaire-based data. The network
of social contacts is represented as a weighted multilayer network constructed
from different channels of telecommunication as well as data on face-to-face
contacts. We find that even strongly connected individuals are not more similar
with respect to basic personality traits than randomly chosen pairs of
individuals. In contrast, several socio-demographic variables have a
significant degree of similarity. We further observe that similarity might be
present in one layer of the multilayer network and simultaneously be absent in
the other layers. For a variable such as gender, our measure reveals a
transition from similarity between nodes connected with links of relatively low
weight to dissimilarity for the nodes connected by the strongest links. We
finally analyze the overlap between layers in the network for different levels
of acquaintanceships. Comment: 12 pages, 4 figures
The memory remains: understanding collective memory in the digital age
Recently developed information communication technologies, particularly the Internet, have affected how we, both as individuals and as a society, create, store, and recall information. The Internet also provides us with a great opportunity to study memory using transactional large-scale data in a quantitative framework similar to the practice in natural sciences. We make use of online data by analyzing viewership statistics of Wikipedia articles on aircraft crashes. We study the relation between recent events and past events and particularly focus on understanding memory-triggering patterns. We devise a quantitative model that explains the flow of viewership from a current event to past events based on similarity in time, geography, topic, and the hyperlink structure of Wikipedia articles. We show that, on average, the secondary flow of attention to past events generated by these remembering processes is larger than the primary attention flow to the current event. We report these previously unknown cascading effects.
An alternative approach to the limits of predictability in human mobility
Next place prediction algorithms are invaluable tools, capable of increasing the efficiency of a wide variety of tasks, ranging from reducing the spreading of diseases to better resource management in areas such as urban planning. In this work we estimate upper and lower limits on the predictability of human mobility to help assess the performance of competing algorithms. We do this using GPS traces from 604 individuals participating in a multi-year experiment, the Copenhagen Networks Study. Earlier works, focusing on the prediction of a participant’s whereabouts in the next time bin, have found very high upper limits (> 90%). We show that these upper limits are highly dependent on the choice of spatiotemporal scales and mostly reflect stationarity, i.e. the fact that people tend not to move during small changes in time. This leads us to propose an alternative approach, which aims to predict the next location, rather than the location in the next bin. Our approach is independent of the temporal scale and introduces a natural length scale. By removing the effects of stationarity we show that the predictability of the next location is significantly lower (71%) than the predictability of the location in the next bin.
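The difference between predicting the location in the next time bin and predicting the next location can be illustrated with a small sketch. It does not reproduce the predictability bounds of the paper. It only shows how stationarity inflates next-bin accuracy. The trace, its location labels, and the function names are hypothetical and chosen for illustration.

```python
from itertools import groupby


def next_bin_pairs(bins):
    """Return (current, next) pairs for consecutive time bins."""
    return list(zip(bins[:-1], bins[1:]))


def visit_sequence(bins):
    """Collapse runs of identical consecutive bins into a sequence of visits."""
    return [loc for loc, _ in groupby(bins)]


def stay_accuracy(bins):
    """Accuracy of the trivial guess 'the next entry equals the current one'."""
    pairs = next_bin_pairs(bins)
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical trace: one location label per fixed-length time bin.
trace = ["home"] * 8 + ["work"] * 9 + ["gym"] * 2 + ["home"] * 5

visits = visit_sequence(trace)   # stationarity removed
acc = stay_accuracy(trace)       # dominated by bins in which nothing changes
```

On the binned trace the trivial guess is right whenever the person does not move, so its accuracy grows as the bins get shorter. On the collapsed visit sequence the same guess is never right, which is why the next-location task is the harder and more informative one.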
Asymmetry in relations.
We show the maximum likelihood distribution of the initiative parameter across relationships. Here μ = 0.5 corresponds to a fully symmetric relationship, while μ = 0 is a fully asymmetric one. The full line corresponds to the maximum likelihood distribution as it is derived from the data. We also test the method for bias and uncertainty by applying it to synthetic data sets. The dashed line corresponds to the average estimate among the synthetic data sets, and the error bars correspond to the spread in these estimates. We suspect that many of the highly asymmetric relations at μ = 0 are of non-social character.
Interpretation of the update formula in Eq (2).
If the signal at some point takes the value γ₀, then a small time step, dt, later, its value will be realized from a Gaussian with a mean determined by f(γ₀) and a spread determined by g(γ₀). By performing statistics over many such realizations we may therefore obtain the drift and diffusion.
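The estimation idea in this caption can be sketched numerically. Eq (2) is not shown in this listing, so the sketch assumes the standard Euler-Maruyama form, γ(t + dt) = γ(t) + f(γ) dt + g(γ) √dt ξ with ξ drawn from a standard normal distribution. The functions f_true and g_true, the bin count, and all parameter values are hypothetical test choices, not the model of the paper.

```python
import numpy as np


def simulate(f, g, x0, dt, n_steps, rng):
    """Euler-Maruyama integration of dx = f(x) dt + g(x) dW (assumed form)."""
    x = np.empty(n_steps)
    x[0] = x0
    noise = rng.standard_normal(n_steps - 1)
    for i in range(n_steps - 1):
        x[i + 1] = x[i] + f(x[i]) * dt + g(x[i]) * np.sqrt(dt) * noise[i]
    return x


def estimate_drift_diffusion(x, dt, n_bins=20, min_count=2000):
    """Bin the signal by its current value and take statistics of the increments."""
    dx = np.diff(x)
    start = x[:-1]
    edges = np.linspace(start.min(), start.max(), n_bins + 1)
    idx = np.clip(np.digitize(start, edges) - 1, 0, n_bins - 1)
    centers, drift, diffusion = [], [], []
    for b in range(n_bins):
        inc = dx[idx == b]
        if inc.size < min_count:  # skip sparsely populated bins
            continue
        centers.append(0.5 * (edges[b] + edges[b + 1]))
        drift.append(inc.mean() / dt)               # estimate of f at the bin center
        diffusion.append(np.sqrt(inc.var() / dt))   # estimate of g at the bin center
    return np.array(centers), np.array(drift), np.array(diffusion)


# Hypothetical test functions with multiplicative noise.
def f_true(x):
    return -x


def g_true(x):
    return 0.5 * np.sqrt(1.0 + x * x)


rng = np.random.default_rng(0)
dt = 0.01
signal = simulate(f_true, g_true, x0=0.0, dt=dt, n_steps=200_000, rng=rng)
centers, drift, diffusion = estimate_drift_diffusion(signal, dt)
slope = np.polyfit(centers, drift, 1)[0]  # should be close to -1 for f_true
```

Each bin collects many realizations that start near the same value, so the mean increment divided by dt approximates the drift and the increment variance divided by dt approximates the squared diffusion. The min_count threshold is a pragmatic choice to keep noisy tail bins out of the estimate.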
Plots of the probability density function (A) and power spectrum (B) for a simulation of the model in Eq (16).
Note that the power law exponents of -1 and -3 match those of the data.